Overcoming data locality: An in-memory runtime file system with symmetrical data distribution

نویسندگان

  • Alexandru Uta
  • Andreea Sandu
  • Thilo Kielmann
چکیده

In many-task computing (MTC), applications such as scientific workflows or parameter sweeps communicate via intermediate files; application performance strongly depends on the file system in use. The state of the art uses runtime systems providing in-memory file storage that is designed for data locality: files are placed on those nodes that write or read them. With data locality, however, task distribution conflicts with data distribution, leading to application slowdown, and worse, to prohibitive storage imbalance. To overcome these limitations, we present MemFS, a fully symmetrical, in-memory runtime file system that stripes files across all compute nodes, based on a distributed hash function. Our cluster experiments with Montage and BLAST workflows, using up to 512 cores, show that MemFS has both better performance and better scalability than the state-of-the-art, locality-based file system, AMFS. Furthermore, our evaluation on a public commercial cloud validates our cluster results. On this platform MemFS shows excellent scalability up to 1024 cores and is able to saturate the 10G Ethernet bandwidth when running BLAST and Montage. The contents of this chapter have been originally published in the Future Generation Computer Systems journal, volume 54, 2016 and have been slightly modified to improve readability.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Study of Implicit Data Distribution Methods for OpenMP Using the SPEC Benchmarks

In contrast to the common belief that OpenMP requires data-parallel extensions to scale well on architectures with non-uniform memory access latency, recent work has shown that it is possible to develop OpenMP programs with good levels of memory access locality, without any extension of the OpenMP API. The vehicle for localizing memory accesses transparently to the programming model, is a runti...

متن کامل

ARS: an adaptive runtime system for locality optimization

Shared memory programs running on Non-Uniform Memory Access (NUMA) machines usually face inherent performance problems stemming from excessive remote memory accesses. A solution, called the Adaptive Runtime System (ARS), is presented in this paper. ARS is designed to adjust the data distribution at runtime through automatic page migrations. It uses memory access histograms gathered by hardware ...

متن کامل

Cacheminer: A Runtime Approach to Exploit Cache Locality on SMP

ÐExploiting cache locality of parallel programs at runtime is a complementary approach to a compiler optimization. This is particularly important for those applications with dynamic memory access patterns. We propose a memory-layout oriented technique to exploit cache locality of parallel loops at runtime on Symmetric Multiprocessor (SMP) systems. Guided by application-dependent and targeted ar...

متن کامل

A Transparent Runtime Data Distribution Engine for OpenMP

This paper makes two important contributions. First, the paper investigates the performance implications of data placement in OpenMP programs running on modern NUMA multiprocessors. Data locality and minimization of the rate of remote memory accesses are critical for sustaining high performance on these systems. We show that due to the low remote-to-local memory access latency ratio of contempo...

متن کامل

Effective Task Binding in Work Stealing Runtimes for Numa Multi-core Processors

Modern server processors in high performance computing consist of multiple integrated memory controllers on-chip and behave as NUMA in nature. Many user level runtime systems like Open MP, Cilk and TBB provide task construct for programming multi core processors. Task body may define the code that can access task local data and also shared data. Most of the shared data is scattered across virtu...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Future Generation Comp. Syst.

دوره 54  شماره 

صفحات  -

تاریخ انتشار 2016